Anaerobic ammonia-oxidizing (anammox) bacteria are able to oxidize ammonia and reduce
nitrite to produce N2 gas. After being discovered in a wastewater treatment plant (WWTP), 
anammox bacteria were subsequently characterized in natural environments, including 
marine, estuary, freshwater, and terrestrial habitats. Although anammox bacteria play an 
important role in removing fixed N from both engineered and natural ecosystems, broad 
scale anammox bacterial distributions have not yet been summarized. The objectives of 
this study were to explore global distributions and diversity of anammox bacteria and to 
identify factors that influence their biogeography. Over 6000 anammox 16S rRNA gene 
sequences from the public database were analyzed in this study. Data ordinations
indicated that salinity was an important factor governing anammox bacterial distributions, 
with distinct populations inhabiting natural and engineered ecosystems. Gene phylogenies 
and rarefaction analysis demonstrated that freshwater environments and the marine water 
column harbored the highest and the lowest diversity of anammox bacteria, respectively. 
Co-occurrence network analysis indicated that Ca. Scalindua strongly connected with 
other Ca. Scalindua taxa, whereas Ca. Brocadia co-occurred with taxa from both known 
and unknown anammox genera. Our survey provides a better understanding of ecological 
factors affecting anammox bacterial distributions and provides a comprehensive baseline 
for understanding the relationships among anammox communities in global environments. 
INTRODUCTION

The anaerobic ammonia oxidation (anammox) process converts 
ammonia to N2 gas by using nitrite as electron acceptor under 
anoxic conditions (van de Graaf et al., 1995). This process is 
important for removing fixed N from both engineered and nat- 
ural systems and can be applied to wastewater treatment in 
order to replace conventional treatment systems. Anammox is 
cost effective and environmentally friendly because it does not 
require aeration or organic carbon inputs, and reduces the pro- 
duction of greenhouse gases (i.e., N2O and CO2) compared to 
conventional denitrification (Jetten et al., 1997; van Dongen et al., 
2001); anammox was first implemented in a full-scale wastew- 
ater treatment plant (WWTP) in Rotterdam, Netherlands (van 
Dongen et al., 2001; Abma et al., 2007; van der Star et al., 2007).
Although anammox bacteria were first discovered in WWTPs
and their applications have been studied worldwide, early
estimates suggested that they account for more than 50% of N loss
from marine environments (Arrigo, 2005; Francis et al., 2007).
More recent reports, however, estimate that anammox bacteria
contribute ~23-30% to N loss from
marine environments (Trimmer and Engstrom, 2011; Dalsgaard
et al., 2012; Babbin et al., 2014). The contributions of anammox
bacteria to biogeochemical N2 production were measured
as 18-36% in groundwater (Moore et al., 2011), 4-37% in paddy 
soils (Zhu et al., 2011), 9-13% in lakes (Schubert et al., 2006),
and 1-8% in estuaries (Trimmer et al., 2003). These results
indicate that anammox bacteria play a key role in the global
N cycle.

Anammox bacteria branch deeply within the Planctomycetes
phylum. There are five known anammox genera, with 16 species 
proposed to date. The first discovered anammox bacterium was 
Ca. Brocadia anammoxidans, enriched from a denitrifying fluidized
bed reactor (Mulder et al., 1995; Kuenen and Jetten, 2001).
The three characterized species within the Ca. Brocadia genus
are Ca. Brocadia fulgida (Kartal et al., 2008), Ca. Brocadia sinica
(Oshiki et al., 2011), and Ca. Brocadia caroliniensis (Rothrock
et al., 2011); all of these were enriched in anammox bioreactors.
The only species reported within the Candidatus Kuenenia
genus is Ca. Kuenenia stuttgartiensis, which was isolated from a 
trickling filter biofilm (Schmid et al., 2000). The Ca. Scalindua 
genus consists of nine proposed species, six of which were discovered
in marine environments (Kuypers et al., 2003; Woebken
et al., 2008; Hong et al., 2011a; Fuchsman et al., 2012; Dang et al.,
2013; van de Vossenberg et al., 2013). Ca. Scalindua sorokinii was
the first anammox species found in a natural environment (the 
Black Sea; Kuypers et al., 2003). Ca. Scalindua richardsii was also 
recovered from the Black Sea (Fuchsman et al., 2012). Although 
these two species originated from the Black Sea, they dominated
in different zones. A cluster associated with Ca. Scalindua
sorokinii was detected in the lower suboxic zone where ammonium
concentration was high, but nitrite concentration was low,
whereas a cluster associated with Ca. Scalindua richardsii was
found in the upper suboxic zone where ammonium concentration
was low, but nitrite concentration was high (Fuchsman et al.,
2012). Ca. Scalindua brodae and Ca. Scalindua wagneri were
both identified in WWTPs (Schmid et al., 2003). Ca. Scalindua
arabica originated in the Arabian Sea and the Peruvian oxygen 
minimum zone (OMZ; Woebken et al., 2008). Ca. Scalindua 
pacifica (Dang et al., 2013) and Ca. Scalindua profunda (van 
de Vossenberg et al., 2013) were retrieved from the Bohai Sea 
and a marine sediment of a Swedish fjord, respectively. Two 
additional species names were tentatively proposed from molec- 
ular surveys: Ca. Scalindua sinooilfield from a high temperature 
petroleum reservoir (Li et al., 2010) and Ca. Scalindua zhenghei 
from marine sediments (the South China Sea; Hong et al., 2011a).
The only known species affiliated with the Ca. Anammoxoglobus 
genus was Ca. Anammoxoglobus propionicus, enriched from an 
anammox reactor (Kartal et al., 2007). Ca. Jettenia asiatica was 
retrieved from a granular sludge anammox reactor (Quan et al., 
2008). Notably, known anammox bacteria species have mostly 
been discovered in engineered environments, but they have com- 
monly been detected in various natural ecosystems and are more 
widespread than previously thought. However, it should be noted 
that Ca. Scalindua sinooilfield and Ca. Scalindua zhenghei are 
not in the category Candidatus on the list of prokaryotic names 
with standing in the nomenclature (LPSN) website. The classi- 
fication and nomenclature of anammox Ca. species need to be 
better clarified and standardized in the future. 

Observations of anammox bacterial diversity have
demonstrated that Ca. Brocadia, Ca. Kuenenia, and Ca.
Anammoxoglobus were commonly found in non-saline environments
(e.g., Egli et al., 2001; Moore et al., 2011; Hu et al., 2013),
whereas Ca. Scalindua dominated saline environments
(e.g., Woebken et al., 2008; Hong et al., 2011a; Villanueva et al.,
2014), including deep-sea methane seep sediments (Shao et al.,
2014). Anammox bacteria have also been detected in extreme
saline-related environments, including hydrothermal vents
(Byrne et al., 2009; Russ et al., 2013) and cold hydrocarbon-rich
seeps (Russ et al., 2013). However, because all previous molecular
surveys of the anammox 16S rRNA genes were from individual 
studies of specific habitats, the overall understanding of global 
anammox bacterial diversities, distributions, and co-occurrences 
among lineages remains unclear. 

Factors affecting anammox bacterial diversity and distribution 
have been investigated within individual habitat-specific studies. 
For example, organic carbon influenced anammox diversity in 
freshwater sediment (Hu et al., 2012b), soil (Shen et al., 2013), 
and an estuary (Hou et al., 2013). Ammonium and nitrite concentrations
correlated with anammox diversity in a mangrove
sediment (Li et al., 2011). Temperature impacted anammox communities
in freshwater sediment (Osaka et al., 2012) and an
estuary (Hou et al., 2013). Depth affected anammox diversity 
in marine sediment (Li et al., 2013). However, no comprehen- 
sive survey has previously explored factors that govern global 
anammox distributions. 

The main objectives of this study were to investigate global 
anammox bacterial distributions and identify factors influencing
anammox bacterial distributions and diversity. Over 6000
anammox 16S rRNA gene sequences from GenBank were collected
and analyzed by both phylogenetic and multivariate statistical
methods. An anammox 16S rRNA gene phylogenetic tree
revealed broad anammox distributions across habitats, including 
marine sediment, marine water column, estuary, mangrove sedi- 
ment, soil, freshwater, freshwater sediment, groundwater, reactor, 
WWTP, marine sponge, biofilter, fish gut, shrimp pond, and oil 
field. Co-occurrence analysis demonstrated strong relationships 
among dominant anammox phylotypes. Global distributions 
of anammox bacteria revealed factors that influence anammox 
bacterial distributions, with salinity being the most important 
environmental variable. This study provides a better understand- 
ing of the prevalence of anammox bacterial 16S rRNA genes 
across habitats and the key factors impacting their distribution 
patterns. 

MATERIALS AND METHODS 
DATA COLLECTION AND PREPARATION 

All anammox 16S rRNA gene sequences available in Genbank 
were extracted on October 25th, 2013. In total, 14,790 potential 
anammox-related sequences were collected using the following 
keyword searches: "uncultured planctomycete 16S ribosomal RNA
gene," "anammox bacterium 16S ribosomal RNA gene," "anaerobic
ammonium-oxidizing bacterium 16S ribosomal RNA gene,"
"Candidatus Brocadia 16S ribosomal RNA gene," "Candidatus
Scalindua 16S ribosomal RNA gene," "Candidatus Kuenenia 16S
ribosomal RNA gene," "Candidatus Anammoxoglobus 16S ribosomal
RNA gene," and "Candidatus Jettenia 16S ribosomal RNA
gene." Most anammox bacterial 16S rRNA gene sequences were
deposited in GenBank with the definition "uncultured planctomycete
16S ribosomal RNA gene" (data not shown). However,
this keyword-based search retrieved both anammox and non- 
anammox sequences. All collected sequences were searched by 
BLAST against known anammox species in the GenBank core reference
set and aligned by QIIME v1.7 (Caporaso et al., 2010)
using Infernal (Nawrocki and Eddy, 2013) against the Greengenes 
database (May 2013 revision; DeSantis et al., 2006) to screen 
for anammox-related sequences. After removing non-anammox 
and low quality sequences, over 6000 sequences from >200 
isolation sources were included in the analysis. All anammox 
sequences from across many specific "Isolation source" Genbank 
designations were assigned to 15 general habitats: marine sedi- 
ment, marine water column, estuary, freshwater sediment, fresh- 
water, groundwater, soil, mangrove sediment, WWTP, reac- 
tor, marine sponge, biofilter, fish gut, oil field, and shrimp 
pond. 

Limitations of this analysis included metadata inconsistencies 
and missing environmental parameters across multiple studies. 
Consequently, metadata were qualitatively grouped into three 
broad categories: salinity (saline, mixed, and non-saline environ- 
ments), ecosystem (natural and engineered), and habitat (listed 
above). Another limitation was that it was not possible to con- 
sistently determine relative abundances of anammox sequences 
within each study due to inconsistencies with reporting, sam- 
pling efforts, and methodologies. To address this shortcoming, all 
anammox 16S rRNA gene sequences were clustered into opera- 
tional taxonomic units (OTUs) at 97% identity with cd-hit-est 
v4.5.4 (Fu et al., 2012) and the abundance of each anammox OTU
was only counted as present or absent for each study. 
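
The presence/absence scoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `presence_absence_table`, the `(study_id, otu_id)` input layout, and the example identifiers are hypothetical and are not taken from the cd-hit-est or AXIOME workflows used in the study.

```python
from collections import defaultdict


def presence_absence_table(assignments):
    """Build a study-by-OTU presence/absence table.

    `assignments` is an iterable of (study_id, otu_id) pairs, one pair per
    sequence. Returns (studies, otus, matrix), where matrix[i][j] is 1 if
    study i contains at least one sequence assigned to OTU j, otherwise 0.
    """
    otus_by_study = defaultdict(set)
    for study_id, otu_id in assignments:
        otus_by_study[study_id].add(otu_id)

    studies = sorted(otus_by_study)
    otus = sorted({otu for found in otus_by_study.values() for otu in found})
    matrix = [
        [1 if otu in otus_by_study[study] else 0 for otu in otus]
        for study in studies
    ]
    return studies, otus, matrix


# Hypothetical input: OTU1 occurs twice in study_A but is still scored as 1.
example = [
    ("study_A", "OTU1"), ("study_A", "OTU1"), ("study_A", "OTU2"),
    ("study_B", "OTU2"), ("study_B", "OTU3"),
]
studies, otus, matrix = presence_absence_table(example)
```

Collapsing counts to 0/1 in this way discards how often each OTU was recovered, which is the intended effect when sequencing effort is not comparable between studies.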

STATISTICAL AND MULTIVARIATE ANALYSES 

Individual studies that contributed anammox 16S rRNA gene 
sequences were usually associated with unique Genbank isolation 
sources. Because of this, the numbers of anammox 16S rRNA gene 
sequences contributed per study and/or unique isolation source 
were broad, ranging from 1 to 623 sequences. In order to ensure 
that dissimilarity matrices were generated from datasets derived 
from the same number of sequences from each study, multi- 
ple rarefied datasets were generated that varied in the number 
of sequences derived from each study/isolation source. In cases 
where multiple studies represented compatible isolation sources, 
yet with relatively low numbers of sequences, these sequence data 
were pooled into additional isolation source categories to maxi- 
mize habitat representation in the rarefied analyses. Subsequently, 
we tested datasets rarefied to 10, 40, or 100 sequences from each 
isolation source category. 
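
As a rough illustration of the rarefaction step, the sketch below draws a fixed number of sequences from each isolation source category and drops categories that fall below the chosen depth. The function name `rarefy_sources`, the dictionary layout, and the fixed random seed are assumptions made for this example; the study's own rarefied datasets were produced with its own pipeline.

```python
import random


def rarefy_sources(sequences_by_source, depth, seed=0):
    """Randomly subsample `depth` sequences from every isolation source.

    Sources holding fewer than `depth` sequences are excluded, so every
    remaining source contributes the same number of sequences.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    rarefied = {}
    for source, sequences in sequences_by_source.items():
        if len(sequences) < depth:
            continue  # below the threshold for this rarefied dataset
        rarefied[source] = rng.sample(list(sequences), depth)
    return rarefied


# Hypothetical sources with 50, 12, and 5 sequences.
data = {
    "source_a": [f"seq{i}" for i in range(50)],
    "source_b": [f"seq{i}" for i in range(12)],
    "source_c": [f"seq{i}" for i in range(5)],
}
set_10 = rarefy_sources(data, 10)
set_40 = rarefy_sources(data, 40)
```

Raising the depth trades habitat coverage for sample size: here `set_10` retains two sources, whereas `set_40` retains only one.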

After clustering the sequences at 97% identity, all sequences 
were aligned and trimmed in order to consider a single homol- 
ogous spanning region of the 16S rRNA gene, which corre- 
sponded to the positions 384-834 of Escherichia coli (J01695.2; 
Brosius et al., 1978). Any sequences with fewer than 100 bases
after trimming were discarded from the analysis. Consequently,
the sequences from some isolation sources within five minor 
habitats (marine sponge, biofilter, fish gut, oil field, and shrimp 
pond) fell below the threshold for the datasets rarefied to 40 and 100
sequences. All five of these minor habitats were removed from further
analysis. The minimum sequence thresholds remained at 10,
40, and 100 after trimming. Consequently, 10 major habitats
(marine sediment, marine water column, estuary, freshwater sedi- 
ment, freshwater, groundwater, soil, mangrove sediment, WWTP, 
reactor) were considered in this analysis. 
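
The trimming and length filter described above might look like the following sketch. It assumes the sequences are already aligned and that the alignment columns matching E. coli positions 384-834 have been worked out elsewhere; `start_col`, `end_col`, the gap characters, and the function name `trim_alignment` are all hypothetical.

```python
def trim_alignment(aligned, start_col, end_col, min_bases=100, gap_chars="-."):
    """Trim aligned sequences to one column window and drop short ones.

    `aligned` maps sequence IDs to aligned strings of equal length.
    Sequences with fewer than `min_bases` non-gap characters inside the
    window are discarded.
    """
    kept = {}
    for seq_id, sequence in aligned.items():
        window = sequence[start_col:end_col]
        n_bases = sum(1 for char in window if char not in gap_chars)
        if n_bases >= min_bases:
            kept[seq_id] = window
    return kept


# Hypothetical 150-column alignment: one full-length read and two partial reads.
alignment = {
    "full": "A" * 150,
    "short": "-" * 60 + "A" * 90,   # 90 bases in the window, discarded
    "edge": "-" * 50 + "A" * 100,   # exactly 100 bases, kept
}
trimmed = trim_alignment(alignment, 0, 150)
```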

Principal coordinates analysis (PCoA) ordinations were gen- 
erated from unweighted UniFrac distance matrices (Lozupone 
and Knight, 2005) through QIIME (Caporaso et al., 2010). 
Non-metric multidimensional scaling (NMDS) ordinations were 
calculated based on a Jaccard dissimilarity metric, using the
AXIOME pipeline (Lynch et al., 2013). To test treatment effects
and within-group agreement, multi-response permutation procedures
(MRPP) were run with 999 permutations, using the
R library vegan (Oksanen et al., 2008) from within AXIOME.
Analyzed data for each rarefied dataset (10, 40, and 100 
sequences), including the OTU table with taxonomic classifi- 
cations and the analyzed sequences, a mapping file, and the 
source FASTA files, are in a single compressed Supplementary 
Material file ("Sonthiphand supp data files.zip"). All collected 
sequences, with corresponding Genbank accession numbers and 
metadata, are provided in a spreadsheet (sequences.xlsx) within 
the Supplementary Material. 
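
To make the Jaccard and MRPP steps more concrete, the following NumPy sketch computes a Jaccard dissimilarity matrix from a presence/absence table and then runs a simple permutation test on within-group distances. It is not the vegan or AXIOME implementation. The group weighting (n_i / N), the skipping of single-member groups, and the estimation of T from the permutation distribution are assumptions of this sketch, as are all function names and the toy data.

```python
import numpy as np


def jaccard_matrix(presence_absence):
    """Pairwise Jaccard dissimilarity between rows of a 0/1 matrix."""
    pa = np.asarray(presence_absence, dtype=bool)
    n = pa.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            union = np.logical_or(pa[i], pa[j]).sum()
            shared = np.logical_and(pa[i], pa[j]).sum()
            # Two empty samples are treated as identical (assumption).
            dist[i, j] = dist[j, i] = (1.0 - shared / union) if union else 0.0
    return dist


def mrpp_delta(dist, labels):
    """Weighted mean of within-group average distances (weights n_i / N)."""
    labels = np.asarray(labels)
    delta = 0.0
    for group in np.unique(labels):
        idx = np.flatnonzero(labels == group)
        if len(idx) < 2:
            continue  # a single sample has no within-group distance
        within = dist[np.ix_(idx, idx)][np.triu_indices(len(idx), k=1)]
        delta += (len(idx) / len(labels)) * within.mean()
    return delta


def mrpp(dist, labels, permutations=999, seed=0):
    """Permutation test: is the observed delta smaller than expected by chance?"""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = mrpp_delta(dist, labels)
    permuted = np.array(
        [mrpp_delta(dist, rng.permutation(labels)) for _ in range(permutations)]
    )
    expected = permuted.mean()
    return {
        "delta": observed,
        "A": 1.0 - observed / expected,  # chance-corrected within-group agreement
        "T": (observed - expected) / permuted.std(ddof=1),
        "p": (np.sum(permuted <= observed) + 1) / (permutations + 1),
    }


# Toy data: two groups of samples that share no OTUs with each other.
table = [[1, 1, 1, 0, 0, 0]] * 4 + [[0, 0, 0, 1, 1, 1]] * 4
groups = ["saline"] * 4 + ["non-saline"] * 4
result = mrpp(jaccard_matrix(table), groups)
```

In this reading, a negative T and a positive A both indicate that samples within a group are more alike than expected from random group assignments.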

RAREFACTION CURVE AND DIVERSITY INDICES 

Rarefaction curves, observed species, phylogenetic diversity (PD),
Chao1, and Shannon indices were generated by QIIME (Caporaso
et al., 2010). The Wilcoxon signed-rank test was performed with the
R function wilcox.test (R Core Team, 2013). The null hypothesis
was that the number of OTUs between habitats was the same. If p
was < 0.05, the null hypothesis was rejected.
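
For readers unfamiliar with the indices named above, the sketch below shows one common way to compute the Shannon index and the Chao1 richness estimate from per-OTU sequence counts. The base-2 logarithm and the bias-corrected Chao1 form are assumptions of this example and may differ from the QIIME settings used in the study; the function names are illustrative only.

```python
import math


def shannon(counts, base=2):
    """Shannon diversity index H = -sum(p_i * log(p_i)) over observed OTUs."""
    total = sum(counts)
    return -sum(
        (c / total) * math.log(c / total, base) for c in counts if c > 0
    )


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical habitat with five OTUs: two singletons and one doubleton.
otu_counts = [10, 5, 1, 1, 2]
h = shannon(otu_counts)
richness = chao1(otu_counts)  # 5 + 2 * 1 / (2 * 2) = 5.5
```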

PHYLOGENETIC CONSTRUCTION 

Representative sequences for each OTU from each habitat 
were selected for phylogenetic analysis. A total of 505 OTU 
sequences from across all 15 habitats included all known anammox
Candidatus species. Outgroups included cultured non-anammox
species of Planctomycetales, including Planctomyces
maris (X62910), Isosphaera sp. (X81958), Gemmata obscuriglobus
(X85248), Blastopirellula marina (HE861893), Rhodopirellula 
baltica (FI624346), and Pirellula sp. (X81942). Sequences were 
aligned using MUSCLE (Edgar, 2004) and trimmed to a final 
homologous length of ~310 bases. A maximum likelihood tree 
was constructed with PhyML v3.0.1, using the GTR model
(Guindon and Gascuel, 2003). The tree topology was optimized from
five random starts. The approximate likelihood ratio test (aLRT)
was conducted to provide tree topology support. The phylogenetic
tree was visualized with SEAVIEW (Galtier et al., 1996).

CO-OCCURRENCE NETWORK ANALYSIS 

Anammox sequences were sorted by habitat and an OTU table 
was generated by AXIOME. Co-occurrence was assessed using a 
previously described method (Barberan et al., 2012). All single- 
tons were discarded, and OTUs having a Spearman's correlation 
> 0.8 were considered to have a strong co-occurrence relation- 
ship. Spearman's correlation was used because it only checks if 
two OTUs are monotonically related, rather than having a lin- 
ear relationship. As a result, it is less sensitive to differences in 
abundance, and this was desirable because abundance informa- 
tion may have been lost when the sequences were deposited in 
GenBank, as described above. The results were visualized with 
Gephi (Bastian et al., 2009).
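
The correlation filter described above can be sketched as follows. This pure-Python example ranks each OTU profile across habitats (averaging tied ranks), computes Spearman's correlation for every OTU pair, and keeps pairs above 0.8. It is an illustration rather than the Barberan et al. (2012) procedure itself: treating a singleton as an OTU with a total count of one, returning 0 for constant profiles, and the toy table are all assumptions.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treated as "no link" (assumption)
    return cov / math.sqrt(var_x * var_y)


def cooccurrence_edges(otu_table, threshold=0.8):
    """Return (otu_a, otu_b, rho) for OTU pairs with rho above the threshold."""
    kept = {otu: row for otu, row in otu_table.items() if sum(row) > 1}
    edges = []
    for a, b in combinations(sorted(kept), 2):
        rho = spearman(kept[a], kept[b])
        if rho > threshold:
            edges.append((a, b, rho))
    return edges


# Hypothetical OTU-by-habitat table (columns are four habitats).
table = {
    "OTU1": [5, 3, 0, 0],
    "OTU2": [4, 2, 0, 0],
    "OTU3": [0, 0, 3, 6],
    "OTU4": [1, 0, 0, 0],  # singleton, removed before correlation
}
edges = cooccurrence_edges(table)
```

The resulting edge list is the kind of input a network viewer can lay out, with OTUs as nodes and retained correlations as links.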

RESULTS 

DISTRIBUTIONS OF ANAMMOX BACTERIA ACROSS HABITATS 

Anammox sequences were collected from multiple studies and 
isolation sources. The number of sequences was considerably 
different from one isolation source to another. Three rarefied 
sequence collections were generated to compare distribution pat- 
terns. Because the broad range of analyzed sequences (10-623 
sequences) affected dissimilarity measurements, we chose to ana- 
lyze set 40 in more detail to include as many isolation sources as 
possible in our analysis while maximizing sequence sample size 
(Figure 1). This was done because set 10 (i.e., 10 sequences per 
isolation source) showed poor groupings with low correlations 
(data not shown) and both set 40 and set 100 showed similar 
distribution patterns with high correlations (Figure 2). 

All anammox sequences from 10 habitats were visualized 
within an ordination plot based on phylogenetic distances 
by using an unweighted UniFrac distance matrix (Figure 1). 
The first two PCoA principal coordinates (PC1 and PC2)
explained 46% of the variability among all samples. The ordination
demonstrated that anammox sequences clustered significantly
by habitat (Figure 1B), which was supported by
MRPP (T = -7.6, A = 0.14, p < 0.001; Figure 2). All anammox
sequences clustered separately into two main groups (Figure 1B).
Marine sediment, marine water column, estuary, and mangrove
sediment grouped together and were dominated by the Ca. Scalindua
cluster (Figures 1A,B). The WWTP, reactor, soil, freshwater,
freshwater sediment, and groundwater grouped together and
were dominated by Ca. Brocadia, Ca. Jettenia, and the unknown
cluster. Four samples, one each from freshwater, freshwater sediment,
soil, and WWTP, were present in both groups.

FIGURE 1 | Principal coordinate analysis (PCoA) ordination based on an
unweighted UniFrac distance matrix of anammox bacterial 16S rRNA
gene profiles. The taxonomic biplot information for all panels is represented in
(A). Panels (B-D) show distributions of anammox isolation source
representation (points) colored by habitat, salinity, and ecosystem,
respectively. The proportion of the variation explained is indicated on the axes.
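
The proportion of variation attributed to PC1 and PC2 comes from the eigenvalues of a principal coordinates analysis. The NumPy sketch below shows the classical calculation (Gower double-centring followed by an eigendecomposition) on an arbitrary distance matrix. It is not the QIIME implementation: clipping negative eigenvalues to zero is an assumption of this sketch, and the toy distances are not UniFrac values.

```python
import numpy as np


def pcoa(dist, n_axes=2):
    """Classical PCoA: return coordinates and proportion explained per axis."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (d ** 2) @ centering  # double-centred matrix
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = np.clip(eigvals, 0.0, None)  # drop negative eigenvalues
    coords = eigvecs[:, :n_axes] * np.sqrt(positive[:n_axes])
    explained = positive[:n_axes] / positive.sum()
    return coords, explained


# Toy example: four samples lying on a straight line at 0, 1, 2, and 4.
positions = np.array([0.0, 1.0, 2.0, 4.0])
toy_dist = np.abs(positions[:, None] - positions[None, :])
coords, explained = pcoa(toy_dist)
```

Because the toy samples lie on one line, the first axis carries essentially all of the variation; real community data spread the variation over many axes, which is why two axes explain only part of it.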

KEY FACTORS AFFECTING GLOBAL ANAMMOX BACTERIAL 
DISTRIBUTION 

The strongest separation of anammox bacterial sequences was 
linked to sample salinity (Figure 1C), which we assigned qualita- 
tively as saline, "mixed," and non-saline environments. The mixed 
environments were generally river-marine transitional zones, 
mostly from mangrove and estuary habitats. Saline and mixed
environments clustered together and differed significantly from
non-saline environments (Figures 1C, 2B; T = -12.1, A = 0.09,
p < 0.001). However, a few non-saline samples grouped with
saline and mixed samples. The Ca. Scalindua cluster was clearly
dominant in saline environments but almost never detected in
non-saline environments (Figures 1A,C). The major complement
of anammox bacteria found in non-saline environments
comprised the Ca. Brocadia, Ca. Jettenia, and unknown clusters. The
results indicated that salinity was the key factor governing global 
distributions of anammox bacteria. 

DISTINCT ANAMMOX BACTERIA IN NATURAL AND ENGINEERED 
ECOSYSTEMS 

Another factor that showed a significant correlation with the
anammox bacterial distributions was ecosystem type. Although
most anammox sequences were from natural ecosystems, those
from engineered ecosystems grouped together (Figures 1D, 2C;
T = -9.1, A = 0.05, p < 0.001). However, one sample from a
WWTP grouped separately from other samples of engineered
ecosystems (Figure 1D). This WWTP sample contained very few
anammox sequences associated with the Ca. Scalindua cluster. More
robust group separation was visualized by the NMDS generated
from an OTU-based Jaccard distance metric (Figures 2C,F). This
observation demonstrated environmental selection of anammox
bacteria in natural and engineered ecosystems.

FIGURE 2 | Non-metric multidimensional scaling (NMDS) plots of
anammox 16S rRNA gene sequences. The correlations of habitat, salinity,
and ecosystem were calculated by a Jaccard dissimilarity metric. The first
column (A-C) shows the ordination results for datasets rarefied to 40
sequences; all sequences were from 44 isolation sources. The second
column (D-F) shows the datasets rarefied to 100 sequences; all sequences
were from 25 isolation sources. The significance of group separations (A, T,
and p) is indicated within each ordination.



DIVERSITY RICHNESS OF ANAMMOX BACTERIA 

Rarefaction curves and diversity indices showed that freshwater
possessed the highest anammox bacterial diversity, whereas
the marine water column was associated with the lowest diversity
(Figure 3 and Table 1). The diversity of anammox bacteria
in freshwater and marine water column differed significantly
(p = 0.01). The diversity of anammox bacteria in freshwater
and freshwater sediment was not significantly different (p =
0.22). Rarefaction curves of freshwater showed no saturation,
although only 170 sequences were analyzed. The majority of
freshwater anammox sequences were from unpublished data; only
a few publications reported anammox bacterial 16S rRNA gene
sequences from freshwater (Schubert et al., 2006; Hamersley et al.,
2009; Pollet et al., 2011; Han and Gu, 2013; Sonthiphand and
Neufeld, 2013). Consequently, more research on anammox bacteria
in freshwater would be required to confirm this observation.
Overall, the results imply that most novel anammox clusters
remain undiscovered within freshwater habitats.

FIGURE 3 | Rarefaction curves of the anammox bacterial 16S rRNA
gene diversity among sampled habitats. OTUs were generated at 97%
identity. The p-values for differences among all habitats are shown in
Table 2.

Table 1 | The number of anammox sequences and diversity indices
for each habitat.

                      Total collected   Total analyzed   Diversity indices
Habitat               sequences         sequences        PD     Chao1   Shannon
Marine sediment       2046              1921             1.11   35.2    2.66
Marine water column   325               324              0.74   10.3    1.07
Estuary               1365              1347             0.99   53.3    3.66
Freshwater sediment   479               473              1.64   35.5    3.59
Freshwater            170               170              2.26   103.5   3.93
Groundwater           472               126              0.55   13.7    2.06
Soil                  816                                0.78   28.0    3.33
Mangrove sediment     366               339              1.22   42.4    3.30
WWTP                  288               249              1.30   36.3    2.77
Reactor               420               355              1.10   22.9    2.51

PD, Phylogenetic diversity.
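
Whether a rarefaction curve such as those in Figure 3 has reached saturation can be judged from the expected number of OTUs at increasing subsample sizes. The sketch below uses the analytic (hypergeometric) expectation rather than the repeated random subsampling that tools such as QIIME typically perform, so it is a swapped-in technique for illustration. The function names and the example counts are hypothetical.

```python
from math import comb


def expected_otus(counts, depth):
    """Expected number of OTUs observed in a random subsample of `depth` reads.

    For each OTU with c sequences out of N, the chance of missing it entirely
    is C(N - c, depth) / C(N, depth).
    """
    total = sum(counts)
    if depth > total:
        raise ValueError("depth exceeds the number of sequences")
    denominator = comb(total, depth)
    return sum(
        1 - comb(total - c, depth) / denominator for c in counts if c > 0
    )


def rarefaction_curve(counts, step=10):
    """Return (depth, expected OTUs) pairs up to the full sample size."""
    total = sum(counts)
    depths = list(range(step, total + 1, step))
    if not depths or depths[-1] != total:
        depths.append(total)
    return [(depth, expected_otus(counts, depth)) for depth in depths]


# Hypothetical habitat with three OTUs and ten sequences in total.
curve = rarefaction_curve([5, 3, 2], step=4)
```

A curve that is still climbing at the largest depth, as reported for freshwater, suggests that additional sequencing would recover further OTUs.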

The diversity of anammox bacteria in marine sediments was
higher than in the marine water column (p = 0.02; Figure 3,
Table 1). The reason for this observation might be higher physical
and biogeochemical heterogeneity in marine sediments, associated
with a greater overall microbial diversity (Table 1). The
diversity of anammox bacteria among other isolation source samples,
including freshwater sediment, estuary, mangrove sediment,
soil, and marine sediment, showed no significant differences
(Figure 3, Table 2). The diversity of anammox bacteria in engineered
ecosystems, including WWTPs and reactors, was not
significantly different (p = 0.15), consistent with the observation
that anammox bacteria from engineered ecosystems grouped
together (Figures 1D, 2C,F).

Although groundwater, freshwater, and freshwater sediment 
were non-saline isolation sources, the diversity of groundwater 
was low and significantly different from freshwater (p = 0.01) 
and freshwater sediment (p = 0.01; Table 1). However, the inter- 
pretation of this observation must be cautious because only a 
few publications have surveyed anammox bacterial 16S rRNA 
gene sequences in groundwater (Hirsch et al., 2011; Moore et al., 
2011; Sonthiphand and Neufeld, 2013). Only 126 sequences were 
included in this analysis; however, 472 anammox sequences were 
collected from Genbank (Table 1). The majority of groundwa- 
ter anammox sequences were from contaminated groundwater in 
Canada (Moore et al., 2011), and most sequences were excluded 
due to the region of analyzed 16S rRNA genes being outside of the 
region used to generate a phylogenetic tree, which was the basis of 
this analysis. 

PHYLOGENY AND CO-OCCURRENCE OF ANAMMOX BACTERIA

The dominant anammox phylotypes recovered from across all 
isolation sources were Ca. Scalindua and Ca. Brocadia, in addi- 
tion to lower abundance anammox phylotypes, including Ca. 
Kuenenia, Ca. Anammoxoglobus, and Ca. Jettenia (Figure 4A). 
The unknown cluster comprised 76 OTUs; however, the average
number of sequences per OTU was only 1.78. No OTU in the
unknown cluster accounted for a majority of its sequences,
suggesting that the unknown anammox clusters were
likely low-abundance but high-diversity anammox bacteria, possibly
representing part of the rare biosphere of these isolation
sources.

Approximately 70% of total Ca. Scalindua OTU sequences 
were from saline-related environments, including marine sedi- 
ment, marine water column, estuary, and mangrove sediment 
(Figure 4B). Ca. Scalindua was also detectable in soil and 
freshwater-related environments, representing 13 and 8% of all
anammox OTUs from those isolation sources, respectively. 

Ca. Brocadia was most commonly retrieved from non- 
saline environments, including freshwater sediment, freshwater, 
groundwater, and soil (Figure 4B). All freshwater-related envi- 
ronments and soil accounted for 38 and 24% of Ca. Brocadia 
OTU sequences, respectively. Engineered ecosystems, includ- 
ing WWTP and reactor, accounted for 15% of Ca. Brocadia 
OTU sequences. Although 16% of Ca. Brocadia OTU sequences 
were recovered from estuary isolation sources, only 1% of these 
OTUs were associated with marine sediment (Figure 4B). No 
Ca. Brocadia sequences were detected in marine water column 
data. 

Ca. Kuenenia was the third most abundant cluster found 
across all isolation sources (Figure 4A). This cluster was detected 
across nine of the main habitats, but not the marine water column 
(Figure 4B). Ca. Kuenenia was also found in all five minor habi- 
tats, including marine sponge, biofilter, fish gut, shrimp pond, 
and oil field. Although Ca. Kuenenia was present in almost all
habitats, only a few OTUs (1-3 OTUs) per habitat were discovered.
This observation indicated that the Ca. Kuenenia cluster was not
ubiquitous, but still widespread across habitats.

Table 2 | Significance of richness differences among 10 habitats, calculated by the Wilcoxon signed-rank test.

The Ca. Anammoxoglobus cluster was distributed similarly 
to the Ca. Brocadia cluster across isolation sources. For example,
soil and freshwater-related environments accounted for 32
and 28% of Ca. Anammoxoglobus OTU sequences (Figure 4B),
respectively (compare to 24 and 38% for Ca. Brocadia, respectively).
Estuary, WWTP, and reactor each accounted for 14% of
total Ca. Anammoxoglobus OTUs. Marine sediment and marine 
water column samples did not contribute OTUs from the Ca. 
Anammoxoglobus cluster. 

The least abundant of the known anammox bacterial genera
was Ca. Jettenia, which comprised only eight OTUs (Figure 4A).
Although Ca. Jettenia was not commonly detected within most 
isolation sources, the majority of this cluster was retrieved 
from engineered ecosystems, including WWTPs and reactors 
(Figure 4B). These engineered isolation sources accounted for 
51% of all recovered Ca. Jettenia OTUs. Freshwater sediment, 
groundwater, and soil each accounted for 13% of total Ca.
Jettenia OTUs. None of the Ca. Jettenia OTUs were associated with
saline-related environments (Figure 4B). 

The distributions of anammox bacterial OTUs of the unknown 
cluster were relatively similar to those of the Ca. Scalindua cluster 
(Figure 4B). The majority of sequences found in this cluster were from
from saline environments, including marine sediment, marine 
water column, estuary, and mangrove sediment; they accounted 
for 57% of the unknown OTU sequences. Freshwater, freshwater 
sediment, soil, and WWTPs accounted for 12, 9, 7, and 5% of 
unknown OTU sequences, respectively. As with the Ca. Scalindua 
cluster, the unknown cluster was present across nine of the main 
habitats, but not found in groundwater. 

Co-occurrence patterns suggested that Ca. Scalindua OTUs 
correlated very well with other Ca. Scalindua OTUs (Figure 5). 
In some cases, Ca. Scalindua was found together with Ca. 
Brocadia, Ca. Kuenenia, and OTUs from the additional unknown 
cluster. Strong co-occurrences of Ca. Scalindua with Ca. 
Anammoxoglobus and Ca. Jettenia were not observed. Ca. 
Brocadia OTUs within the co-occurrence network were corre- 
lated with OTUs spanning all known genera and the unknown 
anammox cluster (Figure 5). Ca. Anammoxoglobus correlated
consistently with Ca. Brocadia, indicating a close relationship
between OTUs of these two genera. Although eight OTUs of Ca. 
Jettenia were reported (Figure 4A), singleton OTUs were removed 
from this network analysis. Only one main Ca. Jettenia OTU 
formed part of a co-occurrence network (Figure 5). A Ca. Jettenia 
OTU correlated with a Ca. Anammoxoglobus OTU, and these 
linked to a Ca. Brocadia OTU. Overall, the resulting network 
revealed the close relationships among OTUs of Ca. Jettenia, Ca. 
Anammoxoglobus, and Ca. Brocadia clusters. The closest co- 
occurring genus to Ca. Kuenenia was Ca. Brocadia (Figure 5). 
The co-occurrence of Ca. Kuenenia with Ca. Scalindua and one 
OTU of the unknown cluster was also observed. 

DISCUSSION 

Based on an ordination analysis and a non-parametric anal- 
ysis of the distance matrix, we confirmed that salinity is the 
dominant factor governing the global distribution of anammox 
bacteria (Figures 1A,C). These results are not surprising given 
that within-study correlation analyses have previously demon- 
strated that salinity influenced the geographical distribution of 
anammox bacteria in estuary and marsh sediments (Dale et al., 
2009; Hu et al., 2012a; Hou et al., 2013). Ca. Scalindua dominated
saline environments, including marine sediment, marine
water column, estuary, and mangrove sediment. The compre- 
hensive phylogenetic analysis also supported that ~70% of Ca. 
Scalindua were from saline environments (Figures 4A,B). These 
results are consistent with previous observations that a lab-scale 
bioreactor community dominated by Ca. Kuenenia shifted toward 
Ca. Scalindua dominance after being enriched in high salt con- 
centrations for 360 days (Kartal et al., 2006). In addition, salinity 
showed negative correlations with Ca. Scalindua diversity in the 
Bohai Sea sediment (Dang et al., 2013), which would be consistent 
with the low overall diversity we observed for saline environments 
surveyed here (Figure 3). 

Although no pure anammox culture is available so
far, comparative metagenomic studies of Ca. Kuenenia (Strous
et al., 2006; Speth et al., 2012), Ca. Brocadia (Gori et al., 2011),
Ca. Jettenia (Hu et al., 2012c), and Ca. Scalindua (van de
Vossenberg et al., 2013; Villanueva et al., 2014) revealed that Ca.
Scalindua has unique characteristics that support adaptation to
marine environments. Ca. Scalindua has high-affinity ammonium
transport (amtB) and formate/nitrite transport (focA) proteins;
both genes are highly expressed compared to those present in
other anammox species (van de Vossenberg et al., 2013). These
characteristics help Ca. Scalindua adapt to marine environments
where ammonium and nitrite may be limited (Lam and Kuypers,
2011). So far, only Ca. Scalindua is known to contain genes
involved in dipeptide and oligopeptide transport, which are
moderately expressed (van de Vossenberg et al., 2013).
Consequently, Ca. Scalindua may obtain ammonium from degraded
and mineralized organic matter as an alternative source. Ca.
Scalindua also has a relatively versatile metabolism: it can use
NO3-, NO2-, and metal oxides as alternative electron acceptors
(van de Vossenberg et al., 2008, 2013). In the presence of organic
acids (i.e., propionate, acetate, formate), Ca. Scalindua can
perform dissimilatory nitrate reduction to ammonia (DNRA; Jensen
et al., 2011). Lipid assays suggested that ladderane lipids with
three cyclobutane rings and one cyclohexane ring may be specific
to Ca. Scalindua (Kuypers et al., 2003, 2005; van de Vossenberg
et al., 2008). However, it remains unclear whether this lipid
structure contributes to the dominance of Ca. Scalindua in marine
environments; further biochemical assays are needed to verify its
specific function.

FIGURE 4 | Phylogeny and composition of anammox bacteria. (A) A 16S
rRNA-based phylogenetic tree of representative anammox OTU sequences
from 15 habitats: marine sediment, marine water column, estuary, freshwater,
freshwater sediment, groundwater, reactor, WWTP, marine sponge, biofilter,
fish gut, shrimp pond, and oil field. The OTUs were generated at 97% identity.
The OTU sequences grouped into five known anammox clusters and one
unknown cluster. The numbers of OTUs and anammox sequences are
shown in brackets for each cluster. (B) Annotated habitat representation
within six anammox clusters. "Others" represents five minor habitats,
including marine sponge, biofilter, fish gut, shrimp pond, and oil field.

(Figure 4A). Some of the OTUs were excluded from the network prior
to the analysis because of differing 16S rRNA gene regions contained
within the analysis. Node sizes represent the number of connections
and edge width represents correlation strength.

Previous research demonstrates that salinity impacts not only
the distribution patterns and diversity of anammox bacteria but
also their abundance and activity. The abundance of anammox
bacteria increased along the salinity gradients in the Cape Fear
River estuary (Dale et al., 2009) and the Yangtze estuary (Hou
et al., 2013). In contrast to their abundance, the activity of
anammox bacteria was negatively correlated with salinity (Trimmer
et al., 2003; Rich et al., 2008; Koop-Jakobsen and Giblin, 2009).
However, because salinity can be linked with other factors such as
NO3-, NH4+, vegetation zones, and the relative contribution of
denitrifiers, it can be difficult to confirm the independent effect
of salinity on anammox activity (Koop-Jakobsen and Giblin,
2009).



Links between salinity and microbial distributions are not unique
to anammox bacteria and have been observed for other
microorganisms across a broad range of habitats. For example, the
abundance and diversity of ammonia-oxidizing bacteria (AOB) and
archaea (AOA) were affected by salinity (Francis et al., 2003;
Santoro et al., 2008; Biller et al., 2012). The diversity of
denitrifying bacteria in WWTP systems was affected by salinity
(Yoshie et al., 2004), and inhibitory effects of salinity on
nitrification and denitrification rates were observed in estuary
sediment (Rysgaard et al., 1999). Beyond its effects on specific
groups of microorganisms, salinity impacted community fingerprints
and species richness estimates for Bacteria, Archaea, and
Eukaryotes within a solar saltern in Spain (Casamayor et al.,
2002). The bacterial community composition along an estuary
shifted in response to a salinity gradient (Crump et al., 2004).
Statistical and multivariate approaches have also confirmed
salinity as the key factor driving global distribution patterns of
Bacteria (Lozupone and Knight, 2007) and Archaea (Auguet et al.,
2010).

Ordinations, including PCoA (Figure 1D) and NMDS
(Figures 2C,F), showed that anammox bacteria from natural
ecosystems formed clusters apart from those of engineered
ecosystems. This observation suggests environmental selection
of anammox bacteria in natural and engineered ecosystems.
Possible reasons for this finding include differences in the
physiological properties of anammox bacteria, including specific
growth rate (μmax), affinity for ammonia and nitrite (Ks), optimum
growth temperature, and pH. The physiological properties of
Ca. Kuenenia stuttgartiensis (Egli et al., 2001; van der Star et al.,
2008a,b), Ca. Brocadia anammoxidans (Strous et al., 1998,
1999; Jetten et al., 2005), and Ca. Brocadia sinica (Oshiki et al.,
2011) are now characterized. These physiological properties
indicate that Ca. Brocadia sinica is better adapted to engineered
ecosystems because of its lower affinity for ammonia and nitrite,
higher tolerance to O2, and higher growth rate (Oshiki et al.,
2011). Engineered ecosystems are typically associated with high
ammonia and nitrite loads. Wastewater treatment technologies
apply O2 to facilitate AOB activity so that the coexistence of
anammox bacteria and AOB transforms fixed N to N2 gas (Third
et al., 2001; van Dongen et al., 2001).
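
The trade-off between growth rate and substrate affinity can be
made concrete with a short numerical sketch. It assumes Monod
kinetics, mu = mu_max * S / (Ks + S), which is a common model for
such comparisons but is not stated explicitly in this section. The
parameter values below are hypothetical and are not measurements for
any anammox species; they only show that an organism with a higher
maximum growth rate and a lower affinity (higher Ks) outgrows a
high-affinity competitor once the substrate concentration exceeds a
crossover value.

```python
def monod_rate(substrate, mu_max, k_s):
    """Specific growth rate under Monod kinetics: mu_max * S / (Ks + S)."""
    return mu_max * substrate / (k_s + substrate)


def crossover_concentration(mu_a, ks_a, mu_b, ks_b):
    """Substrate concentration above which organism A outgrows organism B.

    Derived by setting the two Monod rates equal and solving for S:
        S* = (mu_b * ks_a - mu_a * ks_b) / (mu_a - mu_b)
    Requires mu_a > mu_b. Returns None when A is faster at every
    positive concentration (no positive crossover exists).
    """
    if mu_a <= mu_b:
        raise ValueError("organism A must have the higher mu_max")
    s_star = (mu_b * ks_a - mu_a * ks_b) / (mu_a - mu_b)
    return s_star if s_star > 0 else None


# Hypothetical parameters in arbitrary but consistent units.
FAST_LOW_AFFINITY = {"mu_max": 0.10, "k_s": 50.0}   # suited to high loads
SLOW_HIGH_AFFINITY = {"mu_max": 0.05, "k_s": 5.0}   # suited to scarce substrate

s_star = crossover_concentration(
    FAST_LOW_AFFINITY["mu_max"], FAST_LOW_AFFINITY["k_s"],
    SLOW_HIGH_AFFINITY["mu_max"], SLOW_HIGH_AFFINITY["k_s"],
)
for s in (10.0, 400.0):
    fast = monod_rate(s, **FAST_LOW_AFFINITY)
    slow = monod_rate(s, **SLOW_HIGH_AFFINITY)
    winner = "fast/low-affinity" if fast > slow else "slow/high-affinity"
    print(f"S = {s}: {winner} organism grows faster")
```

Under these assumed values the high-affinity organism is favored
below the crossover concentration and the fast-growing organism
above it, which matches the reasoning that high ammonia and nitrite
loads in engineered ecosystems favor low-affinity, fast-growing
anammox bacteria.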

After being enriched in fluctuating nitrite concentrations, a
Ca. Brocadia-dominated community shifted to a Ca.
Kuenenia-dominated community due to differences in affinity for
NO2- (van der Star et al., 2008a). An anammox community from a
marine environment shifted from Ca. Scalindua to Ca. Brocadia and
Ca. Kuenenia after being enriched in a bioreactor (Nakajima et al.,
2008). Either Ca. Brocadia or Ca. Kuenenia was commonly dominant
in lab-scale bioreactors (Egli et al., 2001; Hu et al., 2010; Park
et al., 2010). In this study, network co-occurrence analysis showed
that Ca. Brocadia and Ca. Kuenenia OTUs are correlated with one
another (Figure 5). However, more research on the physiological
properties of other anammox species, including kinetic and
biochemical analyses, is needed to better understand niche
differentiation of anammox bacteria in different ecosystems.

Although the diversity of anammox sequences from the marine
water column and marine sediment was significantly different
(Figure 3, Table 2), marine environments harbored a low
overall diversity of anammox bacteria, mostly restricted to Ca.
Scalindua (e.g., Schmid et al., 2007; Woebken et al., 2008; Hong
et al., 2011a,b). Microdiversity within Ca. Scalindua, comprising
several subclusters, was previously discovered in marine OMZs
(Woebken et al., 2008). Microdiversity of Ca. Scalindua was
also found in other marine environments, including the South
China Sea (Hong et al., 2011a; Han and Gu, 2013), Jiaozhou
Bay (Dang et al., 2010), the Bohai Sea (Dang et al., 2013), the
Colombian Pacific (Castro-Gonzalez et al., 2014), and deep-sea
methane seep sediments in the Okhotsk Sea (Shao et al., 2014).
The novel subclusters, Ca. Scalindua zhenghei and Ca. Scalindua
pacifica, were tentatively proposed after being identified in the
South China Sea (Hong et al., 2011a) and the Bohai Sea (Dang
et al., 2013), respectively. Ca. Scalindua showed strong
connections within its cluster but relatively low connectivity to
other known anammox clusters (Figure 5). This observation reflects
the microdiversity within the Ca. Scalindua cluster. However,
co-occurrence of Ca. Scalindua and OTUs from the unknown cluster
was high and consistent, reflecting the close relationship between
the two. The unknown cluster might be a second dominant cluster
in marine environments that has yet to be assigned a
genus-level designation.
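
Diversity comparisons between habitats with unequal sequencing
effort, such as those referred to above (Figure 3), are commonly
made by rarefaction. The sketch below uses the standard analytical
formula for the expected number of OTUs in a random subsample of n
sequences, E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), where N is
the total number of sequences and N_i is the count of OTU i. It is
an illustration only: the software and settings used in this study
are not described in this section, and the OTU counts are
hypothetical.

```python
from math import comb


def rarefied_richness(counts, depth):
    """Expected number of OTUs observed in a subsample of `depth` sequences.

    counts holds the number of sequences assigned to each OTU in one
    habitat. Sampling is without replacement, so depth cannot exceed
    the total number of sequences.
    """
    total = sum(counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total count")
    all_subsamples = comb(total, depth)
    # For each OTU, 1 minus the probability that the subsample misses it.
    return sum(1 - comb(total - c, depth) / all_subsamples
               for c in counts if c > 0)


# Two hypothetical habitats with the same effort (40 sequences, 4 OTUs).
even_habitat = [10, 10, 10, 10]     # sequences spread across OTUs
skewed_habitat = [37, 1, 1, 1]      # one dominant OTU plus rare OTUs

for name, counts in (("even", even_habitat), ("skewed", skewed_habitat)):
    expected = rarefied_richness(counts, depth=10)
    print(f"{name}: {expected:.2f} OTUs expected in 10 sequences")
```

At equal subsampling depth, the habitat dominated by a single OTU
yields a much lower expected richness than the even habitat, which
is the pattern expected for a habitat dominated by closely related
members of a single cluster.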

In contrast to marine environments, freshwater environments
showed a high diversity of anammox bacteria. The coexistence of
Ca. Brocadia with known and unknown anammox clusters was
generally found in previously reported freshwater habitats (Zhang
et al., 2007; Hamersley et al., 2009; Hirsch et al., 2011; Yoshinaga
et al., 2011; Hu et al., 2012b; Sonthiphand and Neufeld, 2013).
However, Ca. Brocadia was the single dominant anammox phylotype
detected in the sediments of the Dongjiang River, Hong Kong
(Sun et al., 2014), Lake Taihu, China (Wu et al., 2012), and the
Grand River, Canada (Sonthiphand et al., 2013). Network analysis
also showed that Ca. Brocadia clusters connected to OTUs
from all known genera and the unknown cluster (Figure 5).
Only Ca. Scalindua was detected in Lake Tanganyika, which
is meromictic with a sharp chemocline (Schubert et al., 2006).
Overall, Ca. Brocadia OTUs were found in all previously reported
freshwater habitats except Lake Tanganyika.

As with other freshwater environments, Ca. Brocadia was the
major anammox phylotype detected in contaminated groundwater,
although Ca. Kuenenia, Ca. Jettenia, Ca. Scalindua, and
OTUs from the unknown cluster were also present (Moore et al.,
2011). However, most of the sequences from that study were removed
from the current analysis, resulting in low richness and
underestimation of anammox phylotypes in groundwater.
Groundwater-specific information is insufficient because few
anammox groundwater surveys have been conducted to date. We
recommend further surveys of ammonia-rich groundwater sources to
obtain a better understanding of anammox bacterial diversity
in this important low-oxygen and N-rich habitat.

The transitional zone between freshwater and marine environments,
including estuary and mangrove sediment, is a dynamic
habitat. River-sea interactions (i.e., river runoff, ocean tides, and
inflow/outflow) possibly enhance the diversity of anammox
bacteria. A mixture of known and unknown anammox clusters was
evident in estuary habitats (Dale et al., 2009; Hirsch et al., 2011;
Hu et al., 2012a; Hou et al., 2013) and mangrove sediment (Han
and Gu, 2013; Li and Gu, 2013; Wang et al., 2013).

A combination of anammox OTUs associated with Ca.
Brocadia, Ca. Kuenenia, Ca. Anammoxoglobus, and Ca. Jettenia
was also found in various soil types, including peat soil (Hu et al.,
2011), fertilized paddy soil (Zhu et al., 2011), flooded paddy
soil (Hu et al., 2013), and agricultural soil (Shen et al., 2013).
However, a single anammox phylotype was reported in some
other soil types. Ca. Jettenia was recovered from manure pond
soil (Sher et al., 2012) and permafrost soil (Humbert et al., 2010).
Ca. Kuenenia was detected in rhizosphere soil (Humbert
et al., 2010). Interestingly, a rice paddy soil was dominated by Ca.
Scalindua (Wang and Gu, 2013). Differences in soil properties
(i.e., nutrients, O2, and pH) and depth reflect microniches
of anammox bacteria within terrestrial habitats (Zhu et al., 2011;
Sher et al., 2012).

Our findings revealed the global distributions and diversity
of anammox bacteria. These results add to previous knowledge
about the geographical distributions and abundances of anammox
bacteria in various environments, including marine (Dang
et al., 2013; Shao et al., 2014), estuary (Hu et al., 2012a; Hou
et al., 2013), soil (Sher et al., 2012), and freshwater (Sonthiphand
and Neufeld, 2013; Sun et al., 2014) habitats. The abundances of
anammox bacteria in marine sediments were positively correlated
with marine water depth (Jaeschke et al., 2010; Sokoll et al.,
2012; Trimmer et al., 2013; Shao et al., 2014). Low temperature
likely favored the abundance of anammox bacteria in marine
sediments (Russ et al., 2013). Consequently, anammox bacteria
likely play a key role in the deep sea, where temperature is
usually low (Jaeschke et al., 2010; Shao et al., 2014). In contrast
to marine environments, the abundance of anammox bacteria
showed a negative correlation with soil depth (Sher et al., 2012).
The suggested reason for higher anammox bacterial abundance
in surface soils was higher nutrient availability in upper layers
compared to bottom layers of the soil profile. Substrate
availability (NO2- and NH4+) influenced anammox bacterial abundance
in marine (Dang et al., 2010), estuary (Hou et al., 2013),
freshwater (Wu et al., 2012; Sun et al., 2014), and soil (Shen et al.,
2013) environments. Because NO2- can be generated from NO3-
reduction, NO3- concentration also affected anammox bacterial
abundance in estuary sediments (Hu et al., 2012a) and
marine sediments (Han and Gu, 2013). Beyond quantifying the
abundance of anammox bacteria, future studies must still assess
their activity in many of the above habitats to better understand
their contributions to ecosystem N loss as part of the global N
cycle.

CONCLUDING REMARKS 

The global distribution pattern of anammox bacteria is
controlled primarily by salinity. Distinct partitioning of anammox
bacterial communities between natural and engineered ecosystems
was also observed in our sequence survey. Insufficient
information on anammox genomes and physiological properties is
available to draw conclusions on how extrinsic factors (i.e.,
salinity, NH4+, NO2-) affect possible anammox bacterial mechanisms.
Additional metagenomic studies of other anammox species
will help compare and contrast the specific genes and their
functions that influence the distribution and co-occurrence of
anammox bacteria. Further investigations of the kinetic and
biochemical properties of more anammox species are needed to better
understand the ecological niche partitioning of anammox bacteria.
Freshwater is a promising habitat in which to discover novel
anammox species, and groundwater, in particular, may be an ideal
study habitat for discovering anammox bacterial contributions
to N loss in freshwater-related environments. Multidisciplinary
approaches, including both metagenomic studies and molecular
anammox surveys, are needed to fill knowledge gaps.